58 research outputs found

    Parallel performance prediction using lost cycles analysis

    Going the distance for protein function prediction: a new distance metric for protein interaction networks

    Due to an error introduced in the production process, the x-axes in the first panels of Figure 1 and Figure 7 are not formatted correctly. The correct Figure 1 can be viewed here: http://dx.doi.org/10.1371/annotation/343bf260-f6ff-48a2-93b2-3cc79af518a9
    In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, given a PPI network as input, outputs the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board.
    MC, HZ, NMD and LJC were supported in part by National Institutes of Health (NIH) R01 grant GM080330. JP was supported in part by NIH grant R01 HD058880. This material is based upon work supported by the National Science Foundation under grant numbers CNS-0905565, CNS-1018266, CNS-1012910, and CNS-1117039, and supported by the Army Research Office under grant W911NF-11-1-0227 (to MEC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
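The abstract does not spell out the DSD formula, so the following is only a minimal sketch under a stated assumption: that DSD between two nodes is the L1 distance between their vectors of expected visit counts under a k-step simple random walk (step 0 included). The function name `dsd_matrix`, the adjacency-list input, and the default `k` are hypothetical choices for illustration, not the interface of the authors' tool.

```python
def dsd_matrix(adj, k=5):
    """Pairwise k-step diffusion-style distance for an undirected graph.

    adj is a list of neighbor lists; nodes are numbered 0..n-1.
    ASSUMPTION: DSD(u, v) = L1 distance between rows of He, where He[u][w]
    is the expected number of visits to w by a k-step random walk from u.
    """
    n = len(adj)
    # Row-stochastic transition matrix of the simple random walk.
    P = [[0.0] * n for _ in range(n)]
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            P[u][v] = 1.0 / len(nbrs)

    # He accumulates P^0 + P^1 + ... + P^k.
    He = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    step = [row[:] for row in He]  # current power of P, starting at P^0
    for _ in range(k):
        step = [
            [sum(step[i][m] * P[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                He[i][j] += step[i][j]

    # L1 distance between the visit-count vectors of every pair of nodes.
    return [
        [sum(abs(He[u][w] - He[v][w]) for w in range(n)) for v in range(n)]
        for u in range(n)
    ]


# Hypothetical toy network: a path 0 - 1 - 2 - 3.
toy_adj = [[1], [0, 2], [1, 3], [2]]
toy_dsd = dsd_matrix(toy_adj, k=3)
```

Unlike shortest-path distance, which takes only small integer values and therefore produces many ties, a distance of this kind is real-valued, which is the property the abstract highlights.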

    Measuring Bottleneck Link Speed in Packet-Switched Networks

    The quality of available network connections can often have a large impact on the performance of distributed applications. For example, document transfer applications such as FTP, Gopher and the World Wide Web suffer increased response times as a result of network congestion. For these applications, the document transfer time is directly related to the available bandwidth of the connection. Available bandwidth depends on two things: 1) the underlying capacity of the path from client to server, which is limited by the bottleneck link; and 2) the amount of other traffic competing for links on the path. If measurements of these quantities were available to the application, the current utilization of connections could be calculated. Network utilization could then be used as a basis for selection from a set of alternative connections or servers, thus providing reduced response time. Such a dynamic server selection scheme would be especially important in a mobile computing environment in which the set of available servers is frequently changing. In order to provide these measurements at the application level, we introduce two tools: bprobe, which provides an estimate of the uncongested bandwidth of a path; and cprobe, which gives an estimate of the current congestion along a path. These two measures may be used in combination to provide the application with an estimate of available bandwidth between server and client thereby enabling application-level congestion avoidance. In this paper we discuss the design and implementation of our probe tools, specifically illustrating the techniques used to achieve accuracy and robustness. We present validation studies for both tools which demonstrate their reliability in the face of actual Internet conditions; and we give results of a survey of available bandwidth to a random set of WWW servers as a sample application of our probe technique. 
    We conclude with descriptions of other applications of our measurement tools, several of which are currently under development.
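The relationship between bottleneck capacity, utilization, and available bandwidth described in this abstract can be illustrated with a small sketch. This is not the bprobe or cprobe algorithm, which the abstract does not detail. It assumes a generic packet-pair style estimate (packet size divided by the inter-arrival gap of back-to-back packets) and the simple relation available = capacity × (1 − utilization). All function names and sample values are hypothetical.

```python
from statistics import median


def bottleneck_estimate(packet_bytes, gaps_seconds):
    """Estimate bottleneck capacity in bits per second.

    ASSUMPTION: back-to-back packets of packet_bytes each are spread out by
    the bottleneck link, so capacity is roughly size / inter-arrival gap.
    The median gap is used to damp outliers caused by cross traffic.
    """
    usable = [g for g in gaps_seconds if g > 0]
    if not usable:
        raise ValueError("need at least one positive inter-arrival gap")
    return packet_bytes * 8 / median(usable)


def available_bandwidth(capacity_bps, utilization):
    """Combine a capacity estimate with a utilization estimate in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return capacity_bps * (1.0 - utilization)


# Hypothetical measurements: 1000-byte probes, gaps in seconds.
capacity = bottleneck_estimate(1000, [0.001, 0.001, 0.002])
available = available_bandwidth(capacity, 0.25)
```

An application choosing among several servers could compute `available` for each candidate and pick the largest, which is the dynamic server selection idea mentioned in the abstract.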

    Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes

    Traffic that is bursty on many or all time scales can be described statistically using the notion of self-similarity, which i

    Long-lasting transient conditions in simulations with heavy-tailed workloads

    Recent evidence suggests that some characteristics of computer and telecommunications systems may be well described using heavy-tailed distributions, that is, distributions whose tail declines like a power law, which means that the probability of extremely large observations is non-negligible. For example, such distributions have been found to describe the lengths of bursts in network traffic and the sizes of files in some systems. As a result, system designers are increasingly interested in employing heavy-tailed distributions in simulation workloads. Unfortunately, these distributions have properties considerably different from the kinds of distributions more commonly used in simulations; these properties make simulation stability hard to achieve. In this paper we explore the difficulty of achieving stability in such simulations, using tools from the theory of stable distributions. We show that such simulations exhibit two characteristics related to stability: slow convergence to steady state, and high variability at steady state. As a result, we argue that such simulations must be treated as effectively always in a transient condition. One way to address this problem is to introduce the notion of time scale as a parameter of the simulation, and we discuss methods for simulating such systems while explicitly incorporating time scale as a parameter.
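The slow convergence described in this abstract can be observed with a small experiment. The sketch below is an illustration only, not the paper's methodology: it assumes a Pareto distribution as the heavy-tailed workload, draws samples by inverse-transform sampling, and records the running sample mean at a few checkpoints. The function names, the shape parameter `alpha`, and the scale `xm` are hypothetical choices.

```python
import random


def pareto_sample(alpha, xm, rng):
    """Draw one Pareto(alpha, xm) value by inverse-transform sampling."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return xm / u ** (1.0 / alpha)


def pareto_mean(alpha, xm):
    """Theoretical mean of a Pareto distribution; finite only for alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("mean is infinite for alpha <= 1")
    return alpha * xm / (alpha - 1.0)


def running_means(alpha, xm, n, checkpoints, seed=0):
    """Return {sample count: running mean} at the requested checkpoints."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    total = 0.0
    result = {}
    for i in range(1, n + 1):
        total += pareto_sample(alpha, xm, rng)
        if i in wanted:
            result[i] = total / i
    return result


# With 1 < alpha < 2 the mean is finite but the variance is infinite,
# so the running mean tends to drift and jump rather than settle quickly.
observed = running_means(alpha=1.5, xm=1.0, n=100_000,
                         checkpoints=[100, 1_000, 10_000, 100_000])
target = pareto_mean(1.5, 1.0)
```

Comparing the entries of `observed` with `target` for several seeds gives a feel for how long such a simulation can remain far from its steady-state value.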